Deep Belief Nets as Function Approximators for Reinforcement Learning
Authors
Abstract

We describe a continuous state/action reinforcement learning method which uses deep belief networks (DBNs) in conjunction with a value-function-based reinforcement learning algorithm to learn effective control policies. Our approach is to first learn a model of the state-action space from data in an unsupervised pretraining phase, and then use neural-fitted Q-iteration (NFQ) to learn an accurat...

Similar Resources
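The two-phase idea above (pretrain features, then run batch fitted Q-iteration) can be sketched on a toy problem. This is a minimal, hypothetical illustration of the NFQ-style loop, not the paper's implementation: a 5-state chain MDP, with a linear model over one-hot features standing in for the DBN-derived features; all names and sizes are illustrative.

```python
import numpy as np

# Minimal sketch of batch fitted Q-iteration in the spirit of NFQ, on a toy
# 5-state chain MDP. A linear model over one-hot features stands in for the
# pretrained DBN features described in the paper (names are illustrative).

N_STATES, GAMMA = 5, 0.9

def step(s, a):
    """Deterministic chain: actions -1/+1, reward 1 for reaching the last state."""
    s2 = int(np.clip(s + a, 0, N_STATES - 1))
    return s2, float(s2 == N_STATES - 1)

# Fixed batch of transitions (s, a, r, s'): NFQ learns offline from such a set.
D = []
for s in range(N_STATES):
    for a in (-1, 1):
        s2, r = step(s, a)
        D.append((s, a, r, s2))

def features(s, a):
    # One-hot (state, action) features; a DBN would supply learned features here.
    f = np.zeros(2 * N_STATES)
    f[s + (N_STATES if a == 1 else 0)] = 1.0
    return f

w = np.zeros(2 * N_STATES)  # Q(s, a) = w . features(s, a)
for _ in range(100):        # fitted Q-iteration sweeps
    X = np.array([features(s, a) for s, a, _, _ in D])
    # Regression targets: r + gamma * max_a' Q(s', a') under the current model.
    y = np.array([r + GAMMA * max(features(s2, -1) @ w, features(s2, 1) @ w)
                  for _, _, r, s2 in D])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

greedy = [1 if features(s, 1) @ w >= features(s, -1) @ w else -1
          for s in range(N_STATES)]
print(greedy)  # expected: move right in every state, i.e. [1, 1, 1, 1, 1]
```

Because the features are exact one-hot indicators here, each least-squares fit reproduces tabular value iteration; with learned (approximate) features, each sweep becomes a genuine regression step, which is the NFQ setting.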
Structured Control Nets for Deep Reinforcement Learning
In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The ...
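The core architectural idea of structured control nets is to split the policy into a linear control term and a small nonlinear (MLP) term whose outputs are summed. A hedged sketch of that decomposition, with illustrative shapes and weight names (not the paper's exact architecture or training setup):

```python
import numpy as np

# Sketch of the structured-control-net decomposition: the action is the sum
# of a linear feedback term and a small nonlinear (MLP) term on the
# observation. Dimensions and names here are illustrative assumptions.

rng = np.random.default_rng(1)
obs_dim, act_dim, hidden = 8, 2, 16

K = rng.normal(size=(act_dim, obs_dim)) * 0.1   # linear control gain
W1 = rng.normal(size=(hidden, obs_dim)) * 0.1   # nonlinear-module weights
W2 = rng.normal(size=(act_dim, hidden)) * 0.1

def policy(obs):
    linear = K @ obs                    # linear feedback term
    nonlinear = W2 @ np.tanh(W1 @ obs)  # small MLP term
    return linear + nonlinear           # summed, per the structured-control idea

a = policy(rng.normal(size=obs_dim))
print(a.shape)  # action vector of shape (act_dim,)
```

The linear term alone can already express classical feedback controllers, so the nonlinear module only has to learn the residual behavior; that is the motivation for the split.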
Deep Belief Networks Are Compact Universal Approximators
Deep Belief Networks (DBN) are generative models with many layers of hidden causal variables, recently introduced by Hinton et al. (2006), along with a greedy layer-wise unsupervised learning algorithm. Building on Le Roux and Bengio (2008) and Sutskever and Hinton (2008), we show that deep but narrow generative networks do not require more parameters than shallow ones to achieve universal appr...
Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of computational resources (basis functions), many researchers are investigating ways to adapt the basis functions during the learning process so that they better fit the value-function landscape. Here we introduce temporal neig...
Convergence of Reinforcement Learning with General Function Approximators
A key open problem in reinforcement learning is to assure convergence when using a compact hypothesis class to approximate the value function. Although the standard temporal-difference learning algorithm has been shown to converge when the hypothesis class is a linear combination of fixed basis functions, it may diverge with a general (nonlinear) hypothesis class. This paper describes the Bridg...
Journal

Journal title: Frontiers in Computational Neuroscience
Year: 2011
ISSN: 1662-5188
DOI: 10.3389/conf.fncom.2011.52.00029